Bidirectional IOHMMs and Recurrent Neural Networks for Protein Secondary Structure Prediction

Authors

  • Pierre Baldi
  • Soren Brunak
  • Paolo Frasconi
  • Gianluca Pollastri
Abstract

Prediction of protein secondary structure (SS) is one of the classical problems in bioinformatics that are best solved using computational prediction methods based on machine learning. Current state-of-the-art predictors are based on feedforward artificial neural networks fed by a fixed-width window of amino acids, centered on the predicted residue. Using a fixed-width small window offers the advantage of architectural simplicity and allows controlling parameter overfitting. On the other hand, relevant information is also contained in distant portions of the proteins and current methods cannot exploit this information. In this chapter, we describe two alternative architectures based on noncausal (bidirectional) dynamics. These architectures can be seen as generalizations of input-output hidden Markov models or recurrent neural networks. Unlike their conventional counterparts, their outputs depend on both upstream and downstream information. This novel algorithmic idea is a first step towards architectures capable of making predictions based on variable ranges of dependencies.
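The noncausal idea described in the abstract can be illustrated with a minimal sketch of a bidirectional recurrent forward pass. This is not the authors' implementation: the one-hot encoding, the `tanh` units, the hidden size, the random weights, and every function and parameter name below are assumptions chosen for illustration. A forward chain summarizes the upstream residues, a backward chain summarizes the downstream residues, and the output at each position combines both with the residue at that position.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Encode an amino-acid string as a (length, 20) one-hot matrix."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    x = np.zeros((len(sequence), len(AMINO_ACIDS)))
    for t, aa in enumerate(sequence):
        x[t, index[aa]] = 1.0
    return x


def init_params(n_in=20, n_hidden=8, n_out=3, seed=0):
    """Random, untrained weights (hypothetical sizes, for illustration only)."""
    rng = np.random.default_rng(seed)

    def w(*shape):
        return rng.normal(0.0, 0.5, size=shape)

    return {
        "Wf_x": w(n_hidden, n_in), "Wf_h": w(n_hidden, n_hidden),  # forward chain
        "Wb_x": w(n_hidden, n_in), "Wb_h": w(n_hidden, n_hidden),  # backward chain
        "Wo_f": w(n_out, n_hidden), "Wo_b": w(n_out, n_hidden),    # output layer
        "Wo_x": w(n_out, n_in),
    }


def brnn_forward(x, p):
    """Return per-residue class probabilities of shape (length, n_out)."""
    T = x.shape[0]
    n_hidden = p["Wf_h"].shape[0]
    f = np.zeros((T, n_hidden))  # upstream context, built left to right
    b = np.zeros((T, n_hidden))  # downstream context, built right to left

    state = np.zeros(n_hidden)
    for t in range(T):
        state = np.tanh(p["Wf_x"] @ x[t] + p["Wf_h"] @ state)
        f[t] = state

    state = np.zeros(n_hidden)
    for t in reversed(range(T)):
        state = np.tanh(p["Wb_x"] @ x[t] + p["Wb_h"] @ state)
        b[t] = state

    # The output at position t sees both contexts plus the residue itself.
    logits = f @ p["Wo_f"].T + b @ p["Wo_b"].T + x @ p["Wo_x"].T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    params = init_params()
    probs = brnn_forward(one_hot("MKTAYIAKQR"), params)
    print(probs.shape)  # one row of three class probabilities per residue
```

Because of the backward chain, changing the last residue of the sequence alters the prediction for the first residue, which a purely causal recurrent network or a small fixed-width window cannot do. The weights here are untrained, so the probabilities themselves carry no meaning.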

Similar resources

A Combination of Support Vector Machines and Bidirectional Recurrent Neural Networks for Protein Secondary Structure Prediction

Predicting the secondary structure of a protein is a main topic in bioinformatics. A reliable predictor is needed by threading methods to improve the prediction of tertiary structure. Moreover, the predicted secondary structure content of a protein can be used to assign the protein to a specific folding class and thus estimate its function. We discuss here the use of support vector machines (SV...

Protein Secondary Structure Prediction with Long Short Term Memory Networks

Prediction of protein secondary structure from the amino acid sequence is a classical bioinformatics problem. Common methods use feed-forward neural networks or SVMs combined with a sliding window, as these models do not naturally handle sequential data. Recurrent neural networks are a generalization of the feed-forward neural network that naturally handles sequential data. We use a bidirect...

Bidirectional segmented-memory recurrent neural network for protein secondary structure prediction

The formation of protein secondary structure, especially the regions of β-sheets, involves long-range interactions between amino acids. We propose a novel recurrent neural network architecture called Segmented-Memory Recurrent Neural Network (SMRNN) and present experimental results showing that SMRNN outperforms conventional recurrent neural networks on long-term dependency problems. In order to ...

Protein Structural Motif Prediction in Multidimensional φ-ψ Space Leads to Improved Secondary Structure Prediction

A significant step towards establishing the structure and function of a protein is the prediction of the local conformation of the polypeptide chain. In this article, we present systems for the prediction of three new alphabets of local structural motifs. The motifs are built by applying multidimensional scaling (MDS) and clustering to pair-wise angular distances for multiple φ-ψ angle values c...

Porter: a new, accurate server for protein secondary structure prediction

Porter is a new system for protein secondary structure prediction in three classes. Porter relies on bidirectional recurrent neural networks with shortcut connections, accurate coding of input profiles obtained from multiple sequence alignments, second stage filtering by recurrent neural networks, incorporation of long range information and large-scale ensembles of predictors. Porter...

Publication date: 2000